Parallel Matrix Multiplication: 2D and 3D (FLAME Working Note #62)

Authors

  • Martin Schatz
  • Jack Poulson
  • Robert van de Geijn
Abstract

We describe an extension of the Scalable Universal Matrix Multiplication Algorithms (SUMMA) from 2D to 3D process grids; the underlying idea is to lower the communication volume by storing redundant copies of one or more matrices. While SUMMA was originally introduced for block-wise matrix distributions, so that most of its communication took place within broadcasts, this paper focuses on element-wise matrix distributions, which lead to allgather-based algorithms. We begin with an allgather-based 2D SUMMA, generalize it to 3D process grids, and then discuss the theoretical and experimental performance benefits of the new algorithms.
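The allgather-based 2D idea mentioned in the abstract can be illustrated with a small single-process NumPy simulation. This is only a sketch under stated assumptions, not the authors' implementation: it assumes an r x c process grid in which process (s, t) owns the entries M[s::r, t::c] (an element-wise cyclic distribution), it replaces real MPI allgathers with in-memory copies, and it gathers the whole inner dimension at once instead of looping over panels. All function names are hypothetical.

```python
import numpy as np

def distribute(M, r, c):
    """Element-wise cyclic distribution (assumed): process (s, t) owns M[s::r, t::c]."""
    return {(s, t): M[s::r, t::c].copy() for s in range(r) for t in range(c)}

def allgather_within_row(local, s, c, n_cols):
    """Simulated allgather over process row s: rebuilds M[s::r, :]."""
    rows = local[(s, 0)].shape[0]
    out = np.empty((rows, n_cols), dtype=local[(s, 0)].dtype)
    for t in range(c):
        out[:, t::c] = local[(s, t)]  # piece contributed by process (s, t)
    return out

def allgather_within_col(local, t, r, n_rows):
    """Simulated allgather over process column t: rebuilds M[:, t::c]."""
    cols = local[(0, t)].shape[1]
    out = np.empty((n_rows, cols), dtype=local[(0, t)].dtype)
    for s in range(r):
        out[s::r, :] = local[(s, t)]  # piece contributed by process (s, t)
    return out

def allgather_matmul_2d(A, B, r, c):
    """Compute C = A @ B as each process of an r x c grid would, then reassemble C."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must agree"
    A_loc = distribute(A, r, c)
    B_loc = distribute(B, r, c)
    C = np.zeros((m, n), dtype=np.result_type(A, B))
    for s in range(r):
        for t in range(c):
            A_rows = allgather_within_row(A_loc, s, c, k)  # A[s::r, :]
            B_cols = allgather_within_col(B_loc, t, r, k)  # B[:, t::c]
            C[s::r, t::c] = A_rows @ B_cols                # local block of C
    return C
```

In this sketch C stays in place: each process only receives the rows of A and columns of B that it needs for its own entries of C. A distributed version would presumably replace the two helper functions with allgather calls on row and column communicators.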


Similar resources

Parallel Matrix Multiplication: A Systematic Journey

We expose a systematic approach for developing distributed memory parallel matrix-matrix multiplication algorithms. The journey starts with a description of how matrices are distributed to meshes of nodes (e.g., MPI processes), relates these distributions to scalable parallel implementation of matrix-vector multiplication and rank-1 update, continues on to reveal a family of matrix-matrix multi...


A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure

The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; a method that runs on all structures, especially the Fibonacci Hypercube structure, is therefore necessary for parallel matr...


Solving Fully Fuzzy Dual Matrix System With Optimization Problem

In this paper, the fuzzy dual matrix system AX + B = CX + D, in which A, B, C, D, and X are LR fuzzy matrices, is studied. First we solve the 1-cut system to find the core of the LR fuzzy solution; then, to obtain the spreads of the LR fuzzy solution, we consider several cases. The spreads are obtained by using multiplication, quasi norm and minimization problem with a special objective funct...


Microwave-assisted synthesis of SiO2 nanoparticles and its application on the flame retardancy of poly styrene and poly carbonate nanocomposites

Various morphologies of silica nanoparticles were synthesized by a microwave-assisted Pechini method. Silica nanostructures were synthesized via a fast reaction between tetraethyl orthosilicate and ammonia in the presence of citric acid and other effective agents in the Pechini procedure. Then, to prepare polymer-matrix nanocomposites, SiO2 nanoparticles were added to poly carbonate (PC) and poly...


Optimal Gram-schmidt Type Algorithms

A Gram-Schmidt type algorithm is given for finite d-dimensional reflexive forms over division rings. The algorithm uses d^3/3 + O(d^2) ring operations. Next, that algorithm is adapted in two new directions. First a sequential algorithm is given whose complexity matches the complexity of matrix multiplication. Second, a parallel NC algorithm is given with similar complexity.



Publication date: 2012